18 research outputs found

    A Survey of State Merging Strategies for DFA Identification in the Limit

    Identification of deterministic finite automata (DFAs) has an extensive history, both in passive learning and in active learning. Intractability results by Gold [5] and Angluin [1] show that finding the smallest automaton consistent with a set of accepted and rejected strings is NP-complete. Nevertheless, a lot of work has been done on learning DFAs from examples within specific heuristics, starting with Trakhtenbrot and Barzdin's algorithm [15], rediscovered and applied to the discipline of grammatical inference by Gold [5]. Many other algorithms have been developed, the convergence of most of which is based on characteristic sets: RPNI (Regular Positive and Negative Inference) by J. Oncina and P. García [11, 12], Traxbar by K. Lang [8], EDSM (Evidence Driven State Merging), Windowed EDSM and Blue-Fringe EDSM by K. Lang, B. Pearlmutter and R. Price [9], SAGE (Self-Adaptive Greedy Estimate) by H. Juillé [7], etc. This paper provides a comprehensive study of the most important state merging strategies developed so far.
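State merging methods such as RPNI and EDSM conventionally start from a prefix tree acceptor (PTA) built from the accepted and rejected strings, and then merge states while keeping the automaton consistent with the sample. The sketch below shows only that starting point, not any merging strategy from the survey; the function names `build_pta` and `classify` and the dictionary-based representation are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Transitions = Dict[Tuple[int, str], int]
Labels = Dict[int, bool]


def build_pta(positive: Iterable[str], negative: Iterable[str]) -> Tuple[Transitions, Labels]:
    """Build a prefix tree acceptor from labeled sample strings.

    transitions maps (state, symbol) to a state. labels maps a state to
    True (accepting) or False (rejecting); states the sample says nothing
    about are simply absent from labels.
    """
    transitions: Transitions = {}
    labels: Labels = {}
    next_state = 1  # state 0 is the root, reached by the empty string

    sample = [(w, True) for w in positive] + [(w, False) for w in negative]
    for word, label in sample:
        state = 0
        for symbol in word:
            key = (state, symbol)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        # A string cannot be both accepted and rejected.
        if labels.get(state, label) != label:
            raise ValueError(f"sample is inconsistent on {word!r}")
        labels[state] = label
    return transitions, labels


def classify(transitions: Transitions, labels: Labels, word: str) -> Optional[bool]:
    """Return True/False if the PTA labels the word, None if it is unknown."""
    state: Optional[int] = 0
    for symbol in word:
        state = transitions.get((state, symbol))
        if state is None:
            return None
    return labels.get(state)
```

For example, `build_pta(["a", "ab"], ["b", ""])` yields a tree with four states in which `"ab"` is accepted, `"b"` is rejected, and `"ba"` is left unknown. A merging algorithm would then generalize by collapsing compatible states of this tree.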

    User interaction modeling and profile extraction in interactive systems : a groupware application case study

    A relevant goal in human-computer interaction is to produce applications that are easy to use and well-adjusted to their users' needs. To address this problem it is important to know how users interact with the system. This work constitutes a methodological contribution capable of identifying the context of use in which users perform interactions with a groupware application (synchronous or asynchronous) and provides, using machine learning techniques, generative models of how users behave. Additionally, these models are transformed into a text that describes in natural language the main characteristics of the interaction of the users with the system. This work was partially supported by project PAC::LFO (MTM2014-55262-P) of Ministerio de Ciencia e Innovación (MICINN), Spain. We are grateful to the referees for their constructive input.

    Behavioral Modeling Based on Probabilistic Finite Automata: An Empirical Study

    Imagine an agent that performs tasks according to different strategies. The goal of Behavioral Recognition (BR) is to identify which of the available strategies is the one being used by the agent, by simply observing the agent's actions and the environmental conditions during a certain period of time. The goal of Behavioral Cloning (BC) is more ambitious. In this last case, the learner must be able to build a model of the behavior of the agent. In both settings, the only assumption is that the learner has access to a training set that contains instances of observed behavioral traces for each available strategy. This paper studies a machine learning approach based on Probabilistic Finite Automata (PFAs), capable of achieving both the recognition and cloning tasks. We evaluate the performance of PFAs in the context of a simulated learning environment (in this case, a virtual Roomba vacuum cleaner robot), and compare it with a collection of other machine learning approaches. This work was partially supported by project PAC::LFO (MTM2014-55262-P) of Programa Estatal de Fomento de la Investigación Científica y Técnica de Excelencia, Ministerio de Ciencia e Innovación (MICINN), Spain, and by the National Science Foundation (NSF) project SCH-1521943, USA.
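The recognition task described above can be illustrated with a deliberately simplified stand-in for a PFA: a first-order model whose states are identified with the last observed symbol, trained by counting transitions in the traces of each strategy. This is not the learning method studied in the paper. The names `train`, `log_likelihood`, `recognize`, the `START` marker, the strategy names and the action symbols are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence

START = "<start>"  # hypothetical marker for the beginning of a trace

Counts = Dict[str, Dict[str, int]]


def train(traces: Sequence[Sequence[str]]) -> Counts:
    """Count symbol-to-symbol transitions over the traces of one strategy."""
    counts: Counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        prev = START
        for symbol in trace:
            counts[prev][symbol] += 1
            prev = symbol
    return counts


def log_likelihood(counts: Counts, trace: Sequence[str],
                   alphabet_size: int, alpha: float = 1.0) -> float:
    """Log-probability of a trace under the model, with add-alpha smoothing."""
    total = 0.0
    prev = START
    for symbol in trace:
        row = counts.get(prev, {})
        numerator = row.get(symbol, 0) + alpha
        denominator = sum(row.values()) + alpha * alphabet_size
        total += math.log(numerator / denominator)
        prev = symbol
    return total


def recognize(models: Dict[str, Counts], trace: Sequence[str],
              alphabet_size: int) -> str:
    """Return the name of the strategy whose model best explains the trace."""
    return max(models,
               key=lambda name: log_likelihood(models[name], trace, alphabet_size))


# Hypothetical action symbols: "F" = move forward, "L" = turn left.
models = {
    "zigzag": train([["F", "L", "F", "L"], ["F", "L", "F", "L", "F"]]),
    "straight": train([["F", "F", "F", "F"]]),
}
print(recognize(models, ["F", "L", "F"], alphabet_size=2))
```

Cloning would go one step further and sample actions from the learned model instead of only scoring observed traces.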

    Global optimality in k-means clustering

    We study the problem of finding an optimum clustering, a problem known to be NP-hard. Existing literature contains algorithms running in time proportional to the number of points raised to a power that depends on the dimensionality and on the number of clusters. Published validations of some of these algorithms are unfortunately incomplete; besides, the constant factors (with respect to the number of points) in their running time bounds have seen several important published improvements but are still huge, exponential in the dimension and in the number of clusters, making the corresponding algorithms fully impractical. We provide a new algorithm, with its corresponding complexity-theoretic analysis. It reduces both the exponent and the constant factor, to the extent that it becomes feasible for relevant particular cases. Additionally, it parallelizes extremely well, so that its implementation on current high-performance hardware is quite straightforward. Our proposal opens the door to potential improvements along a research line that had no practical significance so far; besides, a long but single-shot run of our algorithm allows one to identify absolutely optimum solutions for benchmark problems, whereby alternative heuristic proposals can evaluate the goodness of their solutions and the precise price paid for their faster running times.
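To make the notion of a globally optimum clustering concrete, the sketch below evaluates the standard k-means objective (sum of squared distances to cluster centroids) for every possible assignment of points to clusters and keeps the best one. This exhaustive search is not the algorithm proposed in the paper: it tries k^n assignments and is usable only on a handful of points. The names `kmeans_cost` and `brute_force_optimum` are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans_cost(points: Sequence[Point], assignment: Sequence[int], k: int) -> float:
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, a in zip(points, assignment) if a == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def brute_force_optimum(points: Sequence[Point], k: int) -> Tuple[float, List[int]]:
    """Try all k**n assignments and return (best cost, best assignment)."""
    best_cost = float("inf")
    best_assignment: List[int] = []
    for assignment in product(range(k), repeat=len(points)):
        cost = kmeans_cost(points, assignment, k)
        if cost < best_cost:
            best_cost, best_assignment = cost, list(assignment)
    return best_cost, best_assignment


points = [(0.0,), (1.0,), (10.0,), (11.0,)]
cost, assignment = brute_force_optimum(points, k=2)
print(cost, assignment)
```

A heuristic such as Lloyd's iteration can be run on the same small input and its cost compared with the exhaustive optimum, which is the kind of comparison the abstract has in mind for benchmark problems.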

    MEM and MEM4PP: New Tools Supporting the Parallel Generation of Critical Metrics in the Evaluation of Statistical Models

    This paper describes MEM and MEM4PP as new Stata tools and commands. They support the automatic reporting and selection of the best regression and classification models by adding supplemental performance metrics based on statistical post-estimation and custom computation. In particular, MEM provides helpful metrics, such as the maximum acceptable variance inflation factor (maxAcceptVIF) together with the maximum computed variance inflation factor (maxComputVIF) for ordinary least squares (OLS) regression, the maximum absolute value of the correlation coefficient in the predictors' correlation matrix (maxAbsVPMCC), the area under the receiver operating characteristic curve (AUC-ROC), p and chi-squared of the goodness-of-fit (GOF) test for logit and probit, and also the maximum probability thresholds (maxProbNlogPenultThrsh and maxProbNlogLastThrsh) from Zlotnik and Abraira risk-prediction nomograms (nomolog) for logistic regressions. This new tool also performs the automatic identification of the list of variables if run after most regression commands. After simple successive invocations of MEM (in a .do file acting as a batch file), the collectible results are produced in the console or exported to specially designated files (one .csv for all models in a batch). MEM4PP is MEM's version for parallel processing. It starts from the same batch (the same .do file with its path provided as a parameter) and triggers different instances of Stata to generate the same results in parallel (one .csv for each model in a batch). The paper also includes some examples using real-world data from the World Values Survey (the evidence between 1981 and 2020, version number 1.6). They help us understand how MEM and MEM4PP support the testing of predictor independence, reverse causality checks, the best model selection starting from such metrics, and, ultimately, the replication of all these steps.

    Teaching A Virtual Robot To Perform Tasks By Learning From Observation

    We propose a methodology based on Learning from Observation in order to teach a virtual robot to perform its tasks. Our technique only assumes that behaviors to be cloned can be observed and represented using a finite alphabet of symbols. A virtual agent is used to generate training material, according to a range of strategies of gradually increasing complexity. We use Machine Learning techniques to learn new strategies by observing and thereafter imitating the actions performed by the agent. We perform several experiments to test our proposal. The analysis of those experiments suggests that probabilistic finite state machines could be a suitable tool for the problem of behavioral cloning. We believe that the given methodology is easy to integrate in the learning module of any Ubiquitous Robot Architecture.